Issue 16277 #16716

Closed
Nekmo wants to merge 20 commits

Nekmo commented Jun 12, 2018


Fixed Issue #16277: Atresplayer broken: ERROR: Unsupported URL

Nekmo (Author) commented Jun 12, 2018

Fixed Python2 support :( (urllib.error)

Nekmo (Author) commented Jun 13, 2018

Please re-check the "do-not-merge" label. cc @dstftw

@Aleixenandros

Why isn't this included in the new releases?

It works perfectly from @Nekmo's repository.

try:
    from urllib.error import HTTPError
except ImportError:
    from urllib2 import HTTPError
Collaborator

compat_HTTPError.


 class AtresPlayerIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
+    _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/' \
+                 r'[^/]+/[^/]+/[^/_]+_(?P<id>[A-z0-9]+)/?'
Collaborator

/? is pointless at the end.


 class AtresPlayerIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
+    _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/' \
+                 r'[^/]+/[^/]+/[^/_]+_(?P<id>[A-z0-9]+)/?'
Collaborator

Keep the regex on a single line.


 class AtresPlayerIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
+    _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/' \
+                 r'[^/]+/[^/]+/[^/_]+_(?P<id>[A-z0-9]+)/?'
Collaborator

A-z incorrect.
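A minimal sketch of why the `A-z` range is a problem, and of what a single-line pattern without the trailing `/?` could look like. The `VALID_URL` below is a hypothetical rewrite for illustration, not the regex that was eventually merged; the sample URL is the one quoted later in this thread.

```python
import re

# In ASCII the range A-z covers code points 65..122, which also includes
# the six punctuation characters that sit between 'Z' and 'a'.
loose = re.compile(r'^[A-z0-9]+$')
strict = re.compile(r'^[A-Za-z0-9]+$')

extras = [c for c in map(chr, range(ord('Z') + 1, ord('a'))) if loose.match(c)]
print(extras)  # ['[', '\\', ']', '^', '_', '`']

# Hypothetical single-line pattern (an assumption, not the merged regex).
# re.match only anchors at the start, so an optional trailing slash in the
# URL does not need to be spelled out in the pattern.
VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/[^/_]+_(?P<id>[A-Za-z0-9]+)'

url = ('https://www.atresplayer.com/lasexta/programas/salvados/'
       'temporada-14/francisco_5c9f49237ed1a885b9056c1f/')
video_id = re.match(VALID_URL, url).group('id')
```

With the strict character class, `_` and the other punctuation characters can no longer leak into the captured id.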

     _NETRC_MACHINE = 'atresplayer'
     _TESTS = [
         {
-            'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html',
-            'md5': 'efd56753cda1bb64df52a3074f62e38a',
+            'url': 'https://www.atresplayer.com/lasexta/programas/el-'
Collaborator

Don't split. Same everywhere.

}

self._download_webpage(self._LOGIN_URL, None, 'get login page')
request = sanitized_Request(
Collaborator

Inline into actual _download_* call.

Nekmo (Author), Jul 13, 2018

I don't understand this comment.

bato3 (Contributor), Jul 30, 2018

@Nekmo You can make a JSON POST request with _download_json and set the expected statuses.

For a POST, pass data=urlencode_postdata(form_data).

From common.py:

    def _download_json(
            self, url_or_request, video_id, note='Downloading JSON metadata',
            errnote='Unable to download JSON metadata', transform_source=None,
            fatal=True, encoding=None, data=None, headers={}, query={},
            expected_status=None):
        """
        Return the JSON object as a dict.

        See _download_webpage docstring for arguments specification.
        """

Nekmo (Author), Jul 30, 2018

The server's response is not JSON. This request only sets the cookies and the session.

except JSONDecodeError:
    return original_exception
if isinstance(data, dict) and 'error' in data:
    return ExtractorError('{} returned error: {} ({})'.format(
Collaborator

No {}.
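The likely reason for this remark (an assumption, the reviewer does not spell it out) is that auto-numbered `{}` fields in `str.format` are not available on Python 2.6, which requires explicit indices such as `{0}`. A sketch using %-formatting, which behaves the same on every Python version; the function and argument names are hypothetical because the original call is truncated in the snippet.

```python
def format_error(ie_name, error, description):
    """Build the error message without auto-numbered {} fields."""
    # %-formatting works identically on Python 2.6, 2.7 and 3.x.
    return '%s returned error: %s (%s)' % (ie_name, error, description)


# Sample values are made up for illustration.
message = format_error('AtresPlayer', 'some_error_code', 'some description')
```

Explicit indices (`'{0} returned error: {1} ({2})'.format(...)`) would be the other portable option.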

raise self._atres_player_error(e.exc_info[1].file.read(), e)

for source in video_data['sources']:
    if source['type'] == "application/dash+xml":
Collaborator

Single quotes.

raise self._atres_player_error(e.exc_info[1].file.read(), e)

for source in video_data['sources']:
    if source['type'] == "application/dash+xml":
Collaborator

Should not break if no type.

Contributor

E.g., find the type from the URL extension.

Author

Already fixed in this PR (bbb857c).
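For reference, one way the two suggestions in this thread could be combined: read `type` with `.get()` so a missing key cannot raise, and fall back to the URL extension. This is a stdlib-only sketch, not the code from the fix mentioned above; the helper name and the `src` key are assumptions, since the snippet does not show where the source URL is stored.

```python
import posixpath
from urllib.parse import urlparse


def guess_source_kind(source):
    """Return 'dash', 'hls' or None for a source dict (hypothetical helper)."""
    mime = source.get('type')  # .get() avoids a KeyError when 'type' is absent
    if mime == 'application/dash+xml':
        return 'dash'
    if mime in ('application/vnd.apple.mpegurl', 'application/x-mpegURL'):
        return 'hls'
    # Fallback: look at the extension of the URL path, ignoring the query string.
    # The 'src' key is an assumption about the JSON layout.
    path = urlparse(source.get('src', '')).path
    ext = posixpath.splitext(path)[1].lower()
    return {'.mpd': 'dash', '.m3u8': 'hls'}.get(ext)
```

Sources that yield `None` can simply be skipped in the loop instead of aborting the whole extraction.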

'thumbnail': thumbnail,
'duration': duration,
'title': video_data['titulo'],
'description': video_data['descripcion'],
Collaborator

Read coding conventions on optional/mandatory meta fields.
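A short sketch of what that convention asks for, as I understand it: mandatory fields such as `id` and `title` may fail loudly, while optional ones such as `description` should be read defensively so a missing key does not abort the extraction. The function name is hypothetical; `titulo` and `descripcion` are the keys from the snippet above.

```python
def build_info(video_id, video_data, formats, thumbnail=None, duration=None):
    """Assemble an info dict with mandatory and optional fields (sketch)."""
    return {
        'id': video_id,
        # Mandatory: a missing title should raise, extraction cannot continue.
        'title': video_data['titulo'],
        # Optional: .get() returns None instead of raising KeyError.
        'description': video_data.get('descripcion'),
        'thumbnail': thumbnail,
        'duration': duration,
        'formats': formats,
    }


info = build_info('5c9f49237ed1a885b9056c1f', {'titulo': 'Francisco'}, [])
```

Here `info['description']` is simply `None` when the server omits `descripcion`.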

pereorga commented Apr 8, 2019

FWIW, this is working fine locally, on both Windows 10 (Python 3.7.3) and Debian stable (Python 3.5.3).

It does not work, however, on an Ubuntu 16.04 instance that I have in DigitalOcean (Python 3.5.2). I guess it may be because of IP address-based geo-blocking? I am getting an HTTP 403:

~/dev/youtube-dl/youtube_dl$ python3 ./__main__.py -uUSERNAME -pPASSWORD https://www.atresplayer.com/lasexta/programas/salvados/temporada-14/francisco_5c9f49237ed1a885b9056c1f/

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', '-uUSERNAME', '-pPASSWORD', 'https://www.atresplayer.com/lasexta/programas/salvados/temporada-14/francisco_5c9f49237ed1a885b9056c1f/']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.06.11
[debug] Git HEAD: bbb857c
[debug] Python version 3.5.2 (CPython) - Linux-4.4.0-143-generic-x86_64-with-Ubuntu-16.04-xenial
[debug] exe versions: none
[debug] Proxy map: {}
[AtresPlayer] get login page
[AtresPlayer] post to login form
[AtresPlayer] Set login session
[AtresPlayer] 5c9f49237ed1a885b9056c1f: Downloading player JSON
[AtresPlayer] 5c9f49237ed1a885b9056c1f: Downloading video JSON
Traceback (most recent call last):
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/common.py", line 579, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/netol/dev/youtube-dl/youtube_dl/YoutubeDL.py", line 2211, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/atresplayer.py", line 109, in _real_extract
    fatal=True)
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/common.py", line 767, in _download_json
    data=data, headers=headers, query=query)
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/common.py", line 752, in _download_json_handle
    encoding=encoding, data=data, headers=headers, query=query)
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/common.py", line 599, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query)
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/common.py", line 588, in _request_webpage
    raise ExtractorError(errmsg, sys.exc_info()[2], cause=err)
youtube_dl.utils.ExtractorError: Unable to download JSON metadata: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./__main__.py", line 19, in <module>
    youtube_dl.main()
  File "/home/netol/dev/youtube-dl/youtube_dl/__init__.py", line 472, in main
    _real_main(argv)
  File "/home/netol/dev/youtube-dl/youtube_dl/__init__.py", line 462, in _real_main
    retcode = ydl.download(all_urls)
  File "/home/netol/dev/youtube-dl/youtube_dl/YoutubeDL.py", line 2001, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/home/netol/dev/youtube-dl/youtube_dl/YoutubeDL.py", line 792, in extract_info
    ie_result = ie.extract(url)
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/common.py", line 500, in extract
    ie_result = self._real_extract(url)
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/atresplayer.py", line 113, in _real_extract
    raise self._atres_player_error(e.exc_info[1].file.read(), e)
  File "/home/netol/dev/youtube-dl/youtube_dl/extractor/atresplayer.py", line 80, in _atres_player_error
    data = json.loads(body_response)
  File "/usr/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'
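Note that the final `TypeError` is separate from the 403 and hides it: on Python 3.5, `json.loads` accepts only `str`, while `file.read()` returns `bytes` (bytes input is accepted from Python 3.6 on). A minimal sketch of a more tolerant error-body parser; the function name is hypothetical and UTF-8 is an assumption about the server's encoding.

```python
import json


def parse_error_body(body_response):
    """Parse an HTTP error body as JSON, returning None if it is not JSON."""
    if isinstance(body_response, bytes):
        # Decode explicitly so this also works on Python 3.5 and earlier.
        body_response = body_response.decode('utf-8', 'replace')
    try:
        return json.loads(body_response)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return None


data = parse_error_body(b'{"error": "forbidden"}')
```

With something along these lines, the run above would presumably have surfaced the original 403 (or the server's own error message) rather than a `TypeError`.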
